SOME ASYMPTOTIC RESULTS FOR OCCUPANCY PROBLEMS

Lars Holst

1. Introduction.

Suppose that balls are thrown independently of each other into $N$ cells, so that each ball has the probability $p_k$ of falling into the $k$th cell, $p_1 + \dots + p_N = 1$. Let $Y_n$ denote the number of empty cells after $n$ throws and let $T_b$ denote the throw on which for the first time exactly $b$ cells remain empty, $0 \le b < N$. The symmetrical case $p_1 = \dots = p_N = 1/N$ is discussed in e.g. Feller (1968), see occupancy or waiting time problems.

Depending on how $b, n, N \to \infty$, different asymptotic distributions for $Y_n$ and $T_b$ can be obtained, see e.g. Holst (1971) and for the symmetric case see e.g. Samuel-Cahn (1974). In this paper some remaining problems are investigated for the nonsymmetrical case.

To give precise meanings of the limits obtained, double sequences, e.g. $(p_{kN})_N$, $(Y_{nN})_N$, are considered. But in order to simplify the notation the extra index $N$ will usually be omitted.

2.  A bounded  number  of  empty  cells. 

The following limit theorem for $Y_n$, the number of empty cells after $n$ throws, was proved by Sevastyanov (1972).

Theorem 1. If the p's are such that

(2.1)  $\max_{1 \le k \le N} (1 - p_k)^n \to 0$

and

(2.2)  $\sum_{k=1}^{N} (1 - p_k)^n \to m < \infty$

then

(2.3)  $P(Y_n = y) \to m^y e^{-m} / y!$ ,

or equivalently

(2.4)  $Y_n \Rightarrow \mathrm{Po}(m)$ , when $N \to \infty$.
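As a numerical illustration of (2.3) in the symmetric case, the following sketch compares the exact distribution of $Y_n$ with the Poisson probabilities. It assumes the classical inclusion-exclusion formula for equal $p_k = 1/N$ (as in Feller (1968)); the function names `empty_cell_pmf` and `poisson_pmf` and the values $N = 200$, $n = 921$ are choices made for this example only, with the Poisson mean taken as $m = N(1 - 1/N)^n$.

```python
from fractions import Fraction
from math import comb, exp, factorial


def empty_cell_pmf(N, n):
    """Exact pmf of the number of empty cells for equal p_k = 1/N.

    Uses P(Y_n = y) = C(N,y) * sum_j (-1)^j C(N-y,j) (1 - (y+j)/N)^n,
    evaluated in exact integer arithmetic.
    """
    powers = [k ** n for k in range(N + 1)]  # k^n, reused for every y
    total = N ** n
    pmf = []
    for y in range(N + 1):
        s = sum((-1) ** j * comb(N - y, j) * powers[N - y - j]
                for j in range(N - y + 1))
        pmf.append(Fraction(comb(N, y) * s, total))
    return pmf


def poisson_pmf(m, y):
    """Poisson probability m^y e^{-m} / y!."""
    return m ** y * exp(-m) / factorial(y)


N, n = 200, 921              # illustrative values, n close to N log(N/2)
exact = empty_cell_pmf(N, n)
m = N * (1 - 1 / N) ** n     # sum of (1 - p_k)^n for equal p's

for y in range(6):
    print(y, float(exact[y]), poisson_pmf(m, y))
```

The exact probabilities are kept as fractions so that the alternating sum loses no precision; only the final comparison uses floating point.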

Remark. When the p's are equal an expression for $P(Y_n = y)$ can be obtained from which (2.3) can be derived by elementary methods, see e.g. Feller (1968). In this case (2.1) and (2.2) are replaced by

(2.5)  $N \exp(-n/N) \to m < \infty$

or

(2.6)  $n/N - \log N \to -\log m > -\infty$ .

For $T_b$, the number of balls until $b$ empty cells remain, the limit distribution is given by:

Theorem 2. If $b$ is a fixed integer and for some fixed numbers $C$ and $D$,

(2.7)  $0 < C \le N p_k \le D < \infty$ , for all $k$ and $N$ ,

then, when $N \to \infty$ ,

(2.8)  $\sum_{k=1}^{N} (1 - p_k)^{T_b} \Rightarrow \tfrac{1}{2} \chi^2(2(b+1))$ ,

and

(2.9)  $\sum_{k=1}^{N} \exp(-T_b p_k) \Rightarrow \tfrac{1}{2} \chi^2(2(b+1))$ .
k=l 

Before proving the theorem the following functions are considered:

(2.10)  $f(t) = f_N(t) = \sum_{k=1}^{N} (1 - p_k)^t$ , $t > 0$ ,

and

(2.11)  $g(t) = g_N(t) = \sum_{k=1}^{N} \exp(-t p_k)$ .

Lemma 1. If Condition (2.7) is satisfied, $y > 0$ is a fixed number, and $t = t_N = t(y)$ is defined by the equation

(2.12)  $f(t) = y$ ,

then

(2.13)  $0 < C \le \liminf_{N \to \infty} N \log N / t \le \limsup_{N \to \infty} N \log N / t \le D < \infty$

and when $N \to \infty$

(2.14)  $f([t]) \to y$ ,

(2.15)  $\max_{1 \le k \le N} (1 - p_k)^t \to 0$ ,

(2.16)  $g(t)$ and $g([t]) \to y$ ,

where $[t]$ denotes the integer part of $t$ .

Lemma 2. If $f$ is replaced by $g$ and $g$ by $f$ in Lemma 1, then the same conclusions hold.
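The quantities in Lemma 1 can be computed directly for a concrete choice of cell probabilities. The sketch below is only an illustration under assumptions that are not in the text: half of the cells have $N p_k = C = 0.5$ and half have $N p_k = D = 1.5$, $N = 1000$, $y = 1$, and the equation $f(t) = y$ is solved by bisection with a hypothetical helper `solve_decreasing`.

```python
from math import exp, log


def f(t, p):
    """f(t) = sum over cells of (1 - p_k)^t, as in (2.10)."""
    return sum((1.0 - pk) ** t for pk in p)


def g(t, p):
    """g(t) = sum over cells of exp(-t p_k), as in (2.11)."""
    return sum(exp(-t * pk) for pk in p)


def solve_decreasing(func, y, lo, hi, iters=100):
    """Bisection for func(t) = y when func is strictly decreasing on [lo, hi]."""
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if func(mid) > y:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


# Assumed two-point configuration satisfying (2.7) with C = 0.5, D = 1.5.
N, C, D = 1000, 0.5, 1.5
p = [C / N] * (N // 2) + [D / N] * (N // 2)
y = 1.0

t = solve_decreasing(lambda s: f(s, p), y, 0.0, 2 * N * log(N) / C)
ratio = N * log(N) / t

print("t =", t)
print("N log N / t =", ratio)        # lies between C and D, cf. (2.13)
print("f([t]) =", f(int(t), p))      # close to y, cf. (2.14)
print("g(t) =", g(t, p))             # close to y, cf. (2.16)
```

With $y = 1$ the bounds used in the proof give $C \le N \log N / t$ exactly for every $N$, which makes this choice convenient for a finite-$N$ check.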

Proof of Lemma 1. From Condition (2.7), it follows that

(2.17)  $y = \sum_{k=1}^{N} (1 - p_k)^t \ge N (1 - D/N)^t$ .

Hence for $\varepsilon > 0$ and $N$ sufficiently large

(2.18)  $\log y \ge \log N - t (D + \varepsilon)/N$

and therefore

(2.19)  $D + \varepsilon = (D + \varepsilon) \lim_{N \to \infty} \bigl(1/(1 - \log y / \log N)\bigr) \ge \limsup_{N \to \infty} N \log N / t$ ,

which proves the right inequality of (2.13).

To prove the left inequality of (2.13), the following estimate follows from (2.7):

(2.20)  $y = \sum_{k=1}^{N} (1 - p_k)^t \le N (1 - C/N)^t$ ,

or

(2.21)  $\log y \le \log N + t \log(1 - C/N) \le \log N - tC/N$ .

From this it follows that

(2.22)  $C = C \lim_{N \to \infty} (1 - \log y / \log N)^{-1} \le \liminf_{N \to \infty} N \log N / t$ .

To prove (2.14) we observe that

(2.23)  $(1 - p_k)^{t-1} \ge (1 - p_k)^{[t]} \ge (1 - p_k)^t$ ,

and using (2.7)

(2.24)  $(1 - D/N)^{-1} \sum_{k=1}^{N} (1 - p_k)^t \ge \sum_{k=1}^{N} (1 - p_k)^{[t]} \ge \sum_{k=1}^{N} (1 - p_k)^t$ ,

or from (2.12)

(2.25)  $(1 - D/N)^{-1} y \ge f([t]) \ge y$ ,

from which (2.14) follows.

Combining (2.7) and (2.13) gives for some $K_1 > 0$ and $N$ sufficiently large that

(2.26)  $\max_{1 \le k \le N} (1 - p_k)^t \le (1 - C/N)^t \le (1 - C/N)^{K_1 N \log N} \to 0$ , $N \to \infty$ ,

which proves (2.15).

Using (2.7) and (2.13) it follows that for some constant $K$

(2.27)  $|1 - e^{-t p_k}/(1 - p_k)^t| \le K \log N / N$ ,

and therefore

(2.28)  $|f(t) - g(t)| \le \sum_{k=1}^{N} (1 - p_k)^t \, |1 - e^{-t p_k}/(1 - p_k)^t| \le K \sum_{k=1}^{N} (1 - p_k)^t \log N / N = K y \log N / N \to 0$ ,

which proves (2.16). ■

Proof of Lemma 2. The proof is essentially the same as that for Lemma 1. ■


Proof of Theorem 2. From the definitions it follows that

(2.29)  $Y_n \le b \iff T_b \le n$ ,

and therefore

(2.30)  $P(Y_n \le b) = P(T_b \le n) = P(f(T_b) \ge f(n))$ .

Let $y > 0$ be fixed and define $n = [t]$ with $t = t(y)$ as in Lemma 1. According to Lemma 1 the assumptions of Theorem 1 are satisfied. Hence

(2.31)  $P(f(T_b) \ge y) = P(Y_n \le b) \to P(Y \le b)$ ,

where $Y$ is $\mathrm{Po}(y)$. Furthermore it is well-known that

(2.32)  $P(Y \le b) = P(\tfrac{1}{2} \chi^2(2(b+1)) > y)$ .

(2.31) and (2.32) prove (2.8). Using Lemma 2, the assertion (2.9) follows. ■
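The identity (2.32) can be checked numerically. The sketch below relies on the standard fact that $\tfrac{1}{2}\chi^2(2(b+1))$ has the gamma density $x^b e^{-x}/b!$ on $x > 0$; the tail probability is obtained by Simpson's rule, and the function names and the test values of $b$ and $y$ are chosen for this illustration only.

```python
from math import exp, factorial


def poisson_cdf(y, b):
    """P(Y <= b) for Y distributed Po(y)."""
    return sum(y ** j * exp(-y) / factorial(j) for j in range(b + 1))


def half_chi2_tail(b, y, steps=2000):
    """P(0.5 * chi2(2(b+1)) > y), via Simpson's rule on the gamma density.

    The density of 0.5 * chi2(2(b+1)) is x^b e^{-x} / b! for x > 0.
    `steps` must be even.
    """
    h = y / steps

    def dens(x):
        return x ** b * exp(-x) / factorial(b)

    s = dens(0.0) + dens(y)
    for i in range(1, steps):
        s += (4 if i % 2 else 2) * dens(i * h)
    return 1.0 - s * h / 3.0


for b, y in [(0, 0.7), (2, 1.7), (5, 4.0)]:
    print(b, y, poisson_cdf(y, b), half_chi2_tail(b, y))
```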

Remark. When the p's are equal the theorem can be written

(2.33)  $N (1 - 1/N)^{T_b} \Rightarrow \tfrac{1}{2} \chi^2(2(b+1))$ ,

and therefore

(2.34)  $T_b/N - \log N \Rightarrow -\log(\tfrac{1}{2} \chi^2(2(b+1)))$ .

This result was found by Baum and Billingsley (1965) using complicated calculations. Using the result in Feller (1968) and the method of proof of Theorem 2, (2.33) and (2.34) follow. A consequence of (2.34) is

(2.35)  $T_b / (N \log N) \to 1$ , in probability, as $N \to \infty$ .

Now (2.35) will be generalized. First introduce the distribution function

(2.36)  $H_N(x) = \#(p_k : N p_k \le x)/N$ .
Lemma 3. If $t = t_N = t(y)$ is defined by

(2.37)  $g(t) = g_N(t_N) = y > 0$ ,

and there exists a distribution function $H(x)$ on $[C, D]$ such that

(2.38)  $H_N(x) \to H(x)$ , $N \to \infty$ ,

and

(2.39)  $0 < C = \inf\{x ; H(x) > 0\}$ ,

then for $1/C > \varepsilon > 0$ , when $N \to \infty$ ,

(2.40)  $g_N((\varepsilon + 1/C)(N \log N)) \to 0$ ,

and

(2.41)  $g_N((-\varepsilon + 1/C)(N \log N)) \to +\infty$ .

Proof. From the definitions it follows that

(2.42)  $0 < y = g_N(t_N) = N \int_C^D \exp(-t_N x / N) \, dH_N(x) = \int_C^D \exp\bigl((1 - t_N x/(N \log N)) \log N\bigr) \, dH_N(x)$ .

Consider

(2.43)  $g_N((\varepsilon + 1/C) N \log N) = \int_C^D \exp\bigl((1 - x(1 + \varepsilon C)/C) \log N\bigr) \, dH_N(x)$ .

Now for $C \le x \le D$ it is true that $1 - x(1 + \varepsilon C)/C < 0$ and therefore the exponent in (2.43) is negative, so the integral tends to 0 when $N \to \infty$ , which proves (2.40).

For proving (2.41) consider

(2.44)  $g_N((-\varepsilon + 1/C) N \log N) = \int_C^D \exp\bigl((1 - x(1 - \varepsilon C)/C) \log N\bigr) \, dH_N(x)$ .

For $C \le x < C/(1 - C\varepsilon)$ the exponent is positive, and as the integrand is positive, (2.44) can be estimated from below by

(2.45)  $\int_C^{C/(1 - C\varepsilon)} \exp\bigl((1 - x(1 - \varepsilon C)/C) \log N\bigr) \, dH_N(x) \to +\infty$

by Condition (2.39). ■
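The two limits (2.40) and (2.41) can be observed numerically. The following sketch assumes, for illustration only, a two-point configuration in which half of the cells have $N p_k = C = 0.5$ and half have $N p_k = D = 1.5$, so that $H$ puts mass $1/2$ at each of $C$ and $D$, and takes $\varepsilon = 0.5$.

```python
from math import exp, log


def g(t, p):
    """g_N(t) = sum over cells of exp(-t p_k)."""
    return sum(exp(-t * pk) for pk in p)


# Assumed illustrative values; eps must satisfy 0 < eps < 1/C.
C, D, eps = 0.5, 1.5, 0.5

upper, lower = [], []
for N in (100, 1_000, 10_000):
    p = [C / N] * (N // 2) + [D / N] * (N // 2)
    scale = N * log(N)
    upper.append(g((1 / C + eps) * scale, p))   # tends to 0, cf. (2.40)
    lower.append(g((1 / C - eps) * scale, p))   # tends to +infinity, cf. (2.41)

print("g_N((1/C + eps) N log N):", upper)
print("g_N((1/C - eps) N log N):", lower)
```

For this configuration the sums reduce to $\tfrac{1}{2}(N^{1 - sC} + N^{1 - sD})$ with $s = 1/C \pm \varepsilon$, which makes the direction of each limit easy to see.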


Corollary to Theorem 2. If the Conditions (2.38) and (2.39) are satisfied then

(2.46)  $T_b / (N \log N) \to 1/C$ , in probability, $N \to \infty$ .

Proof. Let $\varepsilon_1 > 0$ and $\varepsilon_2 > 0$ be given. Take a $\delta > 0$ so that

(2.47)  $P(\tfrac{1}{2} \chi^2(2(b+1)) < \delta) < \varepsilon_2/2$ .

For $N$ sufficiently large it follows from Theorem 2 that

(2.48)  $P(g_N(T_b) < \delta) < \varepsilon_2/2$

and from Lemma 3 that

(2.49)  $g_N((\varepsilon_1 + 1/C)(N \log N)) < \delta$ .

Hence

(2.50)  $P(T_b / (N \log N) > \varepsilon_1 + 1/C) = P\bigl(g_N(T_b) < g_N((\varepsilon_1 + 1/C)(N \log N))\bigr) \le P(g_N(T_b) < \delta) < \varepsilon_2/2$ .

In a similar way it is proven that

(2.51)  $P(T_b / (N \log N) < -\varepsilon_1 + 1/C) < \varepsilon_2/2$ .

Hence for $N$ sufficiently large

(2.52)  $P(|T_b / (N \log N) - 1/C| > \varepsilon_1) < \varepsilon_2$ .

Thus the assertion is proved. ■

3. A small fraction of empty cells.

As above, $Y_n$ denotes the number of empty cells after $n$ throws.

Theorem 3. If

(3.1)  $0 < C \le N p_k \le D < \infty$ , for all $k$ and $N$ ,

(3.2)  $n/N \to \infty$ ,

and

(3.3)  $f(n) = E(Y_n) = \sum_{k=1}^{N} (1 - p_k)^n \to +\infty$

then, when $n \to \infty$ ,

(3.4)  $(Y_n - f(n)) / (f(n))^{1/2} \Rightarrow N(0,1)$ ,

and

(3.5)  $(Y_n - g(n)) / (g(n))^{1/2} \Rightarrow N(0,1)$ ,

where

(3.6)  $g(n) = \sum_{k=1}^{N} \exp(-n p_k)$ .

Proof. Using (3.1) and (3.3) it follows that

(3.7)  $+\infty \leftarrow \sum_{k=1}^{N} (1 - p_k)^n \le N (1 - C/N)^n$ ,

hence

(3.8)  $n / (N \log N) = O(1)$ .

Using (3.1), (3.2), and (3.8) gives

(3.9)  $|f(n) - g(n)| \le \sum_{k=1}^{N} \exp(-n p_k) \, |\exp(n \log(1 - p_k) + n p_k) - 1| \le \sum_{k=1}^{N} \exp(-n p_k) \, K n / N^2 \le K (n/N) \exp(-C n/N) \to 0$ .

Hence it is sufficient to prove (3.5). This will be established using convergence of characteristic functions.
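The size of the difference in (3.9) can be checked on a concrete example. The sketch below again assumes a two-point configuration ($N p_k$ equal to $C = 0.5$ or $D = 1.5$, half of the cells each) with $N = 10\,000$ and $n = 3N$; taking the constant $K = D^2$ is an assumption made for this illustration, justified by $-\log(1 - p) - p \le p^2/(2(1 - p))$ for $0 < p < 1$.

```python
from math import exp

# Assumed illustrative configuration satisfying (3.1).
N, C, D = 10_000, 0.5, 1.5
p = [C / N] * (N // 2) + [D / N] * (N // 2)
n = 3 * N

f_n = sum((1.0 - pk) ** n for pk in p)   # f(n) = E(Y_n), cf. (3.3)
g_n = sum(exp(-n * pk) for pk in p)      # g(n), cf. (3.6)

# Right-hand side of (3.9) with the assumed constant K = D**2.
bound = D ** 2 * (n / N) * exp(-C * n / N)

print("f(n) =", f_n)
print("g(n) =", g_n)
print("g(n) - f(n) =", g_n - f_n)
print("bound =", bound)
```

Since $(1 - p)^n < e^{-np}$ for $0 < p < 1$, the difference $g(n) - f(n)$ is positive, so only the upper bound needs checking.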

In Holst (1971) p. 1672 the characteristic function of $Y_n$ is given by

(3.10)  $E(\exp(itY_n)) = \dfrac{n!}{2\pi i N^n} \oint_{|z| = n/N} \dfrac{e^{Nz}}{z^{n+1}} \prod_{k=1}^{N} \bigl(1 + (e^{it} - 1) \exp(-N p_k z)\bigr) \, dz$ .

Using Stirling's formula and changing to polar coordinates it follows that

(3.11)  $E(\exp(it(Y_n - \mu)/\sigma)) = (1 + o(1)) \int_{-\pi}^{\pi} (n/2\pi)^{1/2} \exp(n(e^{i\theta} - 1 - i\theta)) \prod_{k=1}^{N} \Bigl(\exp(-it e^{-n p_k}/\sigma) \bigl(1 + (e^{it/\sigma} - 1) \exp(-n p_k e^{i\theta})\bigr)\Bigr) \, d\theta = (1 + o(1)) \int_{-\pi}^{\pi} h_n(\theta, t) \, d\theta$ ,

where

(3.12)  $\mu = \sigma^2 = g(n) = \sum_{k=1}^{N} \exp(-n p_k)$ , $\sigma > 0$ .

The integral will be studied by the same method as in Holst (1971). Take $0 < a < 1/6$ and split the interval $-\pi \le \theta \le \pi$ into

(3.13)  $A = \{\theta ; a \le |\theta| \le \pi\}$ ,

(3.14)  $B = \{\theta ; n^{a - 1/2} \le |\theta| < a\}$ ,

and

(3.15)  $C = \{\theta ; |\theta| < n^{a - 1/2}\}$ .


From Lemmas 4-6 below it follows that

(3.16)  $E(\exp(it(Y_n - \mu)/\sigma)) = (1 + o(1)) \Bigl(\int_A h_n + \int_B h_n + \int_C h_n\Bigr) \to 0 + 0 + \exp(-t^2/2)$ , $n \to \infty$ .

By the continuity theorem for characteristic functions assertion (3.5) is proved, and thus the theorem. ■

With the same conditions as in Theorem 3 the following lemmas hold.
Now, when $n \to \infty$ ,

(3.23)  $\sum_{k=1}^{N} |\exp(-2 n p_k e^{i\theta}) (e^{it/\sigma} - 1)^2| = o(1) \cdot \sum_{k=1}^{N} \exp(-n p_k)/\sigma^2 = o(1)$ ,

and therefore

(3.24)  $\sum_{k=1}^{N} (\log(1 + \dots) - \dots) = \sum_{k=1}^{N} \bigl(\exp(-n p_k e^{i\theta})(e^{it/\sigma} - 1) - it \exp(-n p_k)/\sigma\bigr) + o(1)$ .

Furthermore, using (3.8), (3.9) and the assumptions, it follows that

(3.25)  $\sum_{k=1}^{N} \exp(-n p_k e^{i\theta})/\sigma^2 \to 1$ ,

and therefore (3.24) can be written

(3.26)  $\sum_{k=1}^{N} (\dots) = \sum_{k=1}^{N} \bigl(\exp(-n p_k e^{i\theta})(it/\sigma - t^2/2\sigma^2) - it \exp(-n p_k)/\sigma\bigr) + o(1) = it \sum_{k=1}^{N} \bigl(\exp(-n p_k (e^{i\theta} - 1)) - 1\bigr) \exp(-n p_k)/\sigma - t^2/2 + o(1)$ .

Now, when $n \to \infty$ ,

(3.27)  $\sum_{k=1}^{N} (n p_k)^2 \theta^2 \exp(-n p_k)/\sigma \le K_1 (n/N)^2 n^{2a - 1} N^{1/2} \exp(-K_2 n/N) \to 0$ .

From this it follows that

(3.28)  $\sum_{k=1}^{N} (\dots) = \theta t \sum_{k=1}^{N} n p_k \exp(-n p_k)/\sigma - t^2/2 + o(1)$ .

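Hence for $\theta$ in $C$ ,

(3.29)  $\log h_n(\theta, t) + \tfrac{1}{2} \log(2\pi/n) = -n\theta^2/2 + \theta t \sum_{k=1}^{N} n p_k \exp(-n p_k)/\sigma - t^2/2 + o(1)$
        $= -\bigl(n^{1/2} \theta - t \sum_{k=1}^{N} n^{1/2} p_k \exp(-n p_k)/\sigma\bigr)^2/2 - t^2 \bigl(1 - (\sum_{k=1}^{N} n^{1/2} p_k \exp(-n p_k)/\sigma)^2\bigr)/2 + o(1)$ .

Now, when $n \to \infty$ ,

(3.30)  $\sum_{k=1}^{N} n^{1/2} p_k \exp(-n p_k)/\sigma \le n^{1/2} N^{-1} \cdot N^{1/2} \exp(-K_4 n/N) \to 0$ .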

Thus with $\psi = n^{1/2} \theta$ the integral (3.21) can be written

(3.31)  $\int_C h_n = \int_{|\psi| < n^a} (2\pi)^{-1/2} \exp\bigl(-(\psi - o(1))^2/2 - t^2/2 + o(1)\bigr) \, d\psi$ ,

which converges to $\exp(-t^2/2)$ when $n \to \infty$ . ■

4. The waiting time for a small fraction.

As above let $T_b$ denote the number of balls thrown until exactly $b = b_N$ cells remain empty. Let $t_b$ be the unique solution of the equation

(4.1)  $b = g(t_b) = \sum_{k=1}^{N} \exp(-t_b p_k)$ .

Theorem 4. If, when $N \to \infty$ ,

(4.2)  $b_N \to +\infty$ ,

(4.3)  $b_N/N \to 0$ ,

and

(4.4)  $0 < C \le N p_k \le D < \infty$ , for all $k$ and $N$ ,

then

(4.5)  $b_N^{-1/2} (T_b - t_b) \sum_{k=1}^{N} p_k \exp(-t_b p_k) \Rightarrow N(0,1)$ .
Proof. From the assumptions it follows that

(4.6)  $C b/N \le A = \sum_{k=1}^{N} p_k \exp(-t_b p_k) \le D b/N$ .

Thus for $N$ sufficiently large

(4.7)  $0 < C \le A N/b \le D < \infty$ .

As in the proof of Theorem 2 the following relation holds

(4.8)  $P((T_b - t_b) A / b^{1/2} \le x) = P(Y_n \le b)$ ,

where

(4.9)  $n = [t_b + x b^{1/2}/A]$ .

It is seen that

(4.10)  $g(n)(1 + o(1)) = g(t_b + x b^{1/2}/A) = \sum_{k=1}^{N} \exp(-t_b p_k) \bigl(1 - x p_k b^{1/2}/A + O(1/b)\bigr) = b - x b^{1/2} + O(1)$ ,

and thus

(4.11)  $g(n) \to +\infty$ ,

and from (3.9) it follows that

(4.12)  $f(n) \to +\infty$ .

Furthermore,

(4.13)  $b = g(t_b) \ge N \exp(-D t_b/N)$ ,

implying that

(4.14)  $t_b/N \to +\infty$

and therefore

(4.15)  $n/N \to +\infty$ .


Hence the assumptions of Theorem 3 are fulfilled and (4.8) and (4.10) give

(4.16)  $P((T_b - t_b) A / b^{1/2} \le x) = P(Y_n \le b) = \Phi\bigl((b - g(n))/(g(n))^{1/2}\bigr) + o(1) = \Phi\bigl((x b^{1/2} + O(1))/(b(1 + o(1)))^{1/2}\bigr) + o(1) \to \Phi(x)$ ,

where $\Phi(x)$ is the standardized normal distribution function. This proves the theorem. ■
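The centering $t_b$ and the scaling constant $A$ of Theorem 4 are straightforward to compute for given cell probabilities. The sketch below is an illustration under assumed values ($N = 10\,000$, $b = 100$, half of the cells with $N p_k = 0.5$ and half with $N p_k = 1.5$); `solve_decreasing` is a hypothetical bisection helper, not part of the paper. It checks the bounds (4.7) and the expansion (4.10) for $x = 1$.

```python
from math import exp, sqrt


def g(t, p):
    """g(t) = sum over cells of exp(-t p_k), as in (4.1)."""
    return sum(exp(-t * pk) for pk in p)


def solve_decreasing(func, y, lo, hi, iters=100):
    """Bisection for func(t) = y when func is strictly decreasing on [lo, hi]."""
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if func(mid) > y:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


# Assumed illustrative configuration satisfying (4.4).
N, C, D, b = 10_000, 0.5, 1.5, 100
p = [C / N] * (N // 2) + [D / N] * (N // 2)

t_b = solve_decreasing(lambda s: g(s, p), b, 0.0, 20.0 * N)  # solves (4.1)
A = sum(pk * exp(-t_b * pk) for pk in p)                     # as in (4.6)

x = 1.0
shifted = g(t_b + x * sqrt(b) / A, p)   # left-hand side of (4.10)

print("t_b =", t_b)
print("A * N / b =", A * N / b)         # lies between C and D, cf. (4.7)
print("g(t_b + x sqrt(b)/A) =", shifted, "vs b - x sqrt(b) =", b - x * sqrt(b))
```

The remaining gap between the two printed values in the last line corresponds to the $O(1)$ term in (4.10).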